Hive Insert into Partition Table

Inserting data into a partitioned Hive table is quite different compared to relational databases. You must specify the partition column in your insert command.

Below are the some methods that you can use when inserting data into a partitioned table in Hive.

  • Insert into Hive partitioned Table using Values clause
  • Inserting data into Hive Partition Table using SELECT clause
  • Named insert data into Hive Partition Table
With Partition Table

INSERT INTO zipcodes PARTITION(state='FL') VALUES (891,'US','TAMPA',33605);

INSERT INTO zipcodes PARTITION(state) VALUES (891,'US','TAMPA',33605,'FL');

Insert Overwrite

INSERT OVERWRITE TABLE
tablename1 [PARTITION
(partcol1=val1,
partcol2=val2 …) [IF
NOT EXISTS]] select_
statement1 FROM from_

Overwrites data in a table unless IF NOT EXISTS is specified for a partition. The data added is selected using a SELECT statement.

The immutable property is supported as of Hive 0.13.0 but does not affect INSERT OVERWRITE TABLE. The default is "immutable"="false".

INSERT INTO TABLE

INSERT INTO TABLE
tablename1 [PARTITION
(partcol1=val1,
partcol2=val2 …)]
select_statement1 FROM
from_statement;

Appends data into a table or partition. The existing data is retained.

If TBLPROPERTIES ("immutable" = "true") is set and a table already has some data, the new data is not added with INSERT INTO statement. If a table is empty, the new data is added with INSERT INTO even if TBLPROPERTIES ("immutable"="true").

No comments:

Post a Comment